Extraction, Transformation, and Loading Processes

نویسندگان

  • Jovanka Adzic
  • Valter Fiore
  • Luisella Sisto
چکیده

ETL stands for extraction, transformation, and loading, in other words, for the data warehouse (DW) backstage. The main focus of our exposition here is the practical application of the ETL process in real world cases with extra problems and strong requirements, particularly performance issues related to population of large data warehouses. In a context of ETL/DW with strong requirements, we can individuate the most common constraints and criticalities that one can meet in developing an ETL system. We will describe some techniques related to the physical database design, pipelining, and parallelism which are crucial for the whole ETL process. We will propose our practical approach, “infrastructure based ETL”; it is not a tool but a set of functionalities or services that experience has proved to be useful and widespread enough in the ETL scenario, and one can build the application on top of it.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Extraction-Transformation-Loading Processes

A data warehouse (DW) is a collection of technologies aimed at enabling the knowledge worker (executive, manager, analyst, etc.) to make better and faster decisions. The architecture of a DW exhibits various layers of data in which data from one layer are derived from data of the lower layer (see Figure 1). The operational databases, also called data sources, form the starting layer. They may c...

متن کامل

A UML Based Approach for Modeling ETL Processes in Data Warehouses

Data warehouses (DWs) are complex computer systems whose main goal is to facilitate the decision making process of knowledge workers. ETL (Extraction-Transformation-Loading) processes are responsible for the extraction of data from heterogeneous operational data sources, their transformation (conversion, cleaning, normalization, etc.) and their loading into DWs. ETL processes are a key componen...

متن کامل

Data Mapping Diagrams for Data Warehouse Design with UML

In DataWarehouse (DW) scenarios, ETL (Extraction, Transformation, Loading) processes are responsible for the extraction of data from heterogeneous operational data sources, their transformation (conversion, cleaning, normalization, etc.) and their loading into the DW. In this paper, we present a framework for the design of the DW back-stage (and the respective ETL processes) based on the key ob...

متن کامل

An Open Source ETL Tool - Medium and Small Scale Enterprise ETL(MaSSEETL)

In Data Warehouse (DW) environment, Extraction-Transformation-Loading (ETL) processes consumes up to 70% of resources. Data quality tools aim at detecting and correcting data problems that affect the accuracy and efficiency of data analysis applications. Source data imported into the data warehouse often has different quality, format, coding etc. In order to bring all the data together in a sta...

متن کامل

Data Warehouse Back-End Tools

The back-end tools of a data warehouse are pieces of software responsible for the extraction of data from several sources, their cleansing, customization, and insertion into a data warehouse. They are known under the general term extraction, transformation and loading (ETL) tools. In all the phases of an ETL process (extraction and exportation, transformation and cleaning, and loading), individ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009